R Validation Hub

Status Report & Workshop

Doug Kelkhoff

2023-09-18

👋 Who We Are

The R Validation Hub is a collaboration to support the adoption of R within a biopharmaceutical regulatory setting (pharmaR.org)

  • Grew out of R/Pharma 2018
  • Led by participants from ~10 organizations
  • With frequent involvement from health authorities (primarily the FDA)
  • And subscribers from ~60 organizations spanning multiple industries

🤝 Affiliates: PSI/AIMS (CAMIS)

Comparing Analysis Method Implementations in Software
A cross-industry group comprising members from PHUSE, PSI, and ASA.

  • Released a white paper providing guidance on appropriate use of statistical methods, for example:
    • Don’t default to the defaults
    • Be specific when drafting analysis plans, including precise methods & options
  • A resource for knowing the details of methods across languages

🤝 Affiliates: PSI/AIMS (CAMIS)

CAMIS Comparisons Resources

| Methods                     | R | SAS | Comparison |
|-----------------------------|---|-----|------------|
| Summary Statistics Rounding | R | SAS | R vs SAS   |
| Summary Statistics          | R | SAS | R vs SAS   |
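One reason CAMIS compares implementations so carefully: even basic defaults differ across languages. For example, R's `round()` uses IEEE 754 "round half to even" (banker's rounding), while SAS's `ROUND()` rounds halves away from zero. A minimal sketch of the R side:

```r
# R rounds halves to the nearest even digit (IEEE 754 round-half-to-even);
# SAS's ROUND() would return 1 and 3 for the first and third calls.
round(0.5)  # 0
round(1.5)  # 2
round(2.5)  # 2
```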

🤝 Affiliates: R Consortium

Works with and provides support to the R Foundation and to the key organizations developing, maintaining, distributing, and using R software.

Key Activities

  • The R Validation Hub
  • R Submission Working Group
  • R Repositories Working Group (i.e., CRAN enhancements & future)

👷‍♂️ What We Do (pharmaR.org)

Products

White Paper

Guidance on compliant use of R and management of packages

New! Repositories

Building a public, validation-ready resource for R packages

Coline Zeballos

New! Communications

Connecting validation experts across the industry

Juliane Manitz

{riskmetric}

Gather and report on risk heuristics to support validation decision-making

Eric Milliman
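The typical {riskmetric} workflow, as documented in the package README, chains a package reference through assessment and scoring. A minimal sketch, assuming {riskmetric} is installed and the referenced package is available to the session:

```r
library(riskmetric)

# Reference a package, run the suite of risk assessments, then convert
# the assessments into numeric scores to support a validation decision
pkg_ref("riskmetric") |>
  pkg_assess() |>
  pkg_score()
```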

{riskassessment}

A web interface to {riskmetric}, supporting review, annotation and cataloging of decisions

Aaron Clark
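As a hedged sketch (assuming the package is installed; `run_app()` is the app's entry point):

```r
# Launch the {riskassessment} Shiny app locally for interactive
# package review, annotation, and decision cataloging
riskassessment::run_app()
```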

New! {riskscore}

An R data package capturing risk metrics across all of CRAN

Aaron Clark

📊 A Quick Survey

Keep your hand raised if…

  • It’s early morning and you need an excuse to stretch
  • This is your first time hearing about the R Validation Hub
  • You’re missing Andy’s posh accent
  • Your org contributes to the R Validation Hub
  • Your org leverages the R Validation Hub guidelines
  • Your org uses R Validation Hub tools ({riskmetric}, {riskassessment})

🗓️ Agenda

  • Updates (20 min)
  • Established Workstream Recap (10 min)
    past, present & future
  • Leaps of Faith: Setting the Tone for our Future (20 min)
  • Open Discussion
    • What’s Next? (20 min)
    • Design Lab (10 min)
  • Closing

📣 Updates

Change of Leadership

  • You may have noticed that I am not Andy Nicholls.
  • Last year, Andy decided to step down to focus on his growing responsibilities as Head of Data Science at GSK.

Pulse Check

  • We looked back on how we had been working
  • Identified new opportunities
    1. Refining our holistic strategic direction
    2. Being more mindful about communication and organization

📜 Workstream Report

R Validation Hub Case Studies

{riskmetric}

{riskassessment}

📦 Repositories Workstream

Repositories Workstream

Supporting a transparent, open, dynamic, cross-industry approach to establishing and maintaining a repository of R packages.

  • Taking ample time to engage stakeholders
    • Validation leads across the industry
    • Active health authority involvement
    • Analytic environment admins and developers
  • Considering the possibilities
    • Mapping needs to solutions that meet the industry where it is
    • …while building the path for it to move forward

How did we get here?

  • Our white paper is widely adopted
  • But its implementation is inconsistent & laborious
    • Variations throughout industry are a risk
    • Sharing software with health authorities is a challenge to any internal solution
    • Health authorities, overwhelmed by technical inconsistencies, are more likely to question software use
  • We feel the most productive path forward is a shared ecosystem

Old dog, new trick

  • Modern package ecosystems are the stats world’s new trick
  • Methods are provided directly by statisticians and academics, rarely by vendors.
  • Risk is managed not by itemized requirements, but by good development practices.
  • We need to learn how to manage risk in a constantly evolving ecosystem

Different strokes

Vendored Stats Products
Data Science Ecosystem

What does a solution look like?

If it ain’t broke, don’t fix it!

  • R has this wonderful thing called CRAN, setting the standard for quality
    • Packages are constantly tested together
    • R has a culture of amazing documentation
    • Statisticians flock to R, and are constantly vetting its implementations

What does a solution look like?

Fool me twice, shame on me

  • R has this thorn in its side called CRAN
    • Builds are difficult to reproduce (key for validation)
    • Quality indicators are lacking
    • Difficult to roll back to an older snapshot (although tools exist to help with this)
    • Governance isn’t always the most friendly
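One example of such a snapshot tool, sketched with a dated CRAN snapshot from Posit Public Package Manager (the date in the URL is illustrative):

```r
# Point R at a frozen CRAN snapshot so package installs are
# reproducible by date rather than tracking a moving target
options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2023-09-18"))
install.packages("riskmetric")  # installs the version available on that date
```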

What does a solution look like?

Closing the CRAN gap for the Pharma Use Case

  • Reproducibility guidelines
  • Standard, public assessment of packages
  • Avenues for communicating about implementations, bugs, security

Repositories Workstream

Work to date

  1. Stakeholder engagement (3 mo)
  2. Product refinement and proof-of-concept planning (1 mo)
  3. POC development (2 mo)